
Interactive Runtime: Queue, Interject, Thinking, etc works.#257

Open
Nate0-1999 wants to merge 63 commits into mpfaffenberger:main from Nate0-1999:main

Conversation


@Nate0-1999 commented Apr 7, 2026

Demo Video

Interactive runtime demo video

Summary

This PR introduces an interactive runtime that lets the user queue prompts and send interjects while their Code Puppy is already running.

The core goal is smooth live interaction across the runtime surfaces that matter in real use:

  • prompt input
  • queued prompts
  • interjects
  • ephemeral thinking / status / response preview
  • durable final transcript output
  • shell commands
  • multiple parallel agents
  • Wiggums of parallel agents
  • combinations in between

I tested these surfaces personally across macOS and Windows, and the interactive runtime / state handling feels solid.

The latest updates also address the CodeRabbit correctness comments without broadening the design:

  • active interactive runtime is always cleared on exit paths
  • queue-full interject rejection no longer cancels active work
  • prompt-only execution correctly unpacks (response, task)
  • background command lifecycle only emits completed on real success
  • shell runtime notifications cannot short-circuit cleanup
  • Antigravity auth no longer reports success if model registration fails
  • prompt-surface reasoning-tool suppression handles partial streamed prefixes

What this introduces

  • a prompt-surface interactive runtime
  • queueable prompts while a run is already active
  • interjects while a run is already active
  • ephemeral live thinking / status / response preview before durable output is printed
  • smooth coordination with shell commands
  • smooth coordination with multiple parallel agents
  • smooth coordination with Wiggum flows and Wiggums of parallel agents
  • foreground ownership of ephemeral UI so background/sub-agent activity does not clobber it
  • terminal-safe live rendering behavior across macOS and Windows
  • graceful handling for malformed replace_in_file payloads

Tested surfaces

I tested:

  • prompt only
  • prompt + queue
  • prompt + interject
  • prompt + reasoning
  • prompt + tool calls
  • prompt + shell commands
  • prompt + shell carriage-return progress
  • prompt + Wiggum
  • prompt + parallel agents
  • prompt + Wiggum + parallel agents
  • prompt + shell + interject
  • prompt + shell + queue
  • prompt + parallel agents + foreground ephemeral UI
  • macOS terminal behavior
  • Windows terminal behavior

Guardrails and regression coverage

The guardrails and regression coverage for this work enforce a few core runtime rules:

  • reasoning should surface through structured reasoning output, not low-level tool-call token spam
  • mutable live tool progress and shell carriage-return progress should stay ephemeral and update in place instead of polluting the durable transcript
  • live response text can appear ephemerally while the run is active, but the final durable AGENT RESPONSE still renders once
  • durable structured outputs still render above the prompt normally
  • foreground ephemeral UI is foreground-only, so parallel sub-agents cannot overwrite or clear it
  • terminals that cannot safely support live redraw should degrade cleanly instead of leaking raw ANSI / control-sequence garbage
  • Windows clipboard fallback should continue to work cleanly
  • malformed file-edit payloads should fail as normal tool errors instead of bubbling up as raw internal exceptions
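
As an illustration of the live-redraw guardrail above, a capability gate might look like the following. The helper name and exact rules are assumptions for this sketch, not the PR's actual terminal_utils.py code:

```python
import os
import sys

def supports_live_redraw(stream=None, env=None) -> bool:
    """Return True only when in-place redraw (CR/ANSI) is likely safe.

    Hypothetical sketch of a terminal capability check: non-TTY streams
    and terminals advertising no capabilities must degrade to plain,
    append-only output instead of emitting control sequences.
    """
    stream = stream if stream is not None else sys.stdout
    env = env if env is not None else os.environ
    if not getattr(stream, "isatty", lambda: False)():
        return False  # pipes/files: never emit CR or ANSI sequences
    term = env.get("TERM", "")
    if term in ("", "dumb"):
        return False  # terminal advertises no redraw capability
    return True
```

Callers would consult this gate before seeding spinners or ephemeral previews, and fall back to durable line-by-line printing otherwise.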

Notes

This is not a broad product redesign. It introduces the interactive runtime and makes the live combinations between prompt, queue, interject, shell, Wiggum, and parallel agents behave correctly.

Summary by CodeRabbit

  • New Features

    • Interactive prompt queuing and interjection with a configurable queue limit (/set queue_limit).
    • Background-friendly OAuth flows now run as cancellable background commands.
  • Bug Fixes

    • Cleaner live-progress updates: carriage-return progress, ephemeral status/preview, and spinner behavior no longer corrupt the transcript or duplicate final responses.
    • Improved shell output handling across terminals and paste fallback behavior.
  • Documentation

    • Added Implementation Guardrails and Interactive Regression Checklist.

Nate0-1999 and others added 30 commits February 26, 2026 23:56
…ctions, and eliminate noise logs for a truly smooth UX
Back out the most recent interject banner rendering experiment while keeping the safer queue/interject runtime path intact. This isolates the shell/queue freeze and missing-visible-interject regressions for validation against capture artifacts.

Made-with: Cursor
revert latest interject visibility tweak
…ig of cleaning up and it's still laggy but great improvement.

Made-with: Cursor
sometime the queue doesn't trigger and the interject printing has a b…
Restore shell streaming during foreground commands and harden queueing and interject messaging.
Nate Oswalt and others added 25 commits March 10, 2026 18:39
…policy

Refine shell, wiggum, and paused queue behavior
Improve prompt-surface streaming previews
Lean terminal compat and protect foreground ephemeral UI

coderabbitai bot commented Apr 7, 2026

📝 Walkthrough


Adds an always-on prompt surface and runtime that routes streamed agent/tool output to prompt-local ephemeral UI; introduces BackgroundInteractiveCommand for background OAuth/work, cooperative cancellation primitives, legacy-to-bus bridging, and terminal capability utilities; and adds extensive tests reflecting the new interactive flows.

Changes

Cohort / File(s): Summary

  • Documentation & Guardrails (IMPLEMENTATION_GUARDRAILS.md, docs/INTERACTIVE_REGRESSION_CHECKLIST.md)
    New docs specifying required prompt-surface rendering/streaming behaviors and regression checks.
  • Interactive Runtime & Commands (code_puppy/command_line/interactive_runtime.py, code_puppy/command_line/interactive_command.py)
    New PromptRuntimeState and BackgroundInteractiveCommand for queueing, ephemeral UI, above-prompt execution, and cooperative cancellation.
  • CLI Runner & Prompt Flow (code_puppy/cli_runner.py, code_puppy/command_line/prompt_toolkit_completion.py)
    Interactive-mode overhaul: unified renderer + legacy bridge, PromptSubmission model, chooser/pending-submission flow, queue/interject handling, spinner seeding, and prompt-surface integration.
  • Event Streaming & Agents (code_puppy/agents/event_stream_handler.py, code_puppy/agents/base_agent.py)
    Prompt-surface-aware streaming: accumulate previews, route tool progress to ephemeral status, suppress duplicate prints, merge tool-name deltas, and silence the previous "Cancelled" emission.
  • Messaging & Rendering (code_puppy/messaging/legacy_bridge.py, code_puppy/messaging/messages.py, code_puppy/messaging/renderers.py, code_puppy/messaging/rich_renderer.py)
    New LegacyQueueToBusBridge, LegacyQueueMessage/AgentListMessage, centralized legacy rendering helper, and rich renderer prompt-surface awareness with above-prompt rendering and CR/progress routing.
  • Terminal & Spinner Utilities (code_puppy/terminal_utils.py, code_puppy/messaging/spinner/console_spinner.py, code_puppy/messaging/spinner/__init__.py)
    Terminal profile helpers, live-update gating, centralized CR/line-clear helper, spinner throttling/pausing using helpers, and spinner context invalidation via the active runtime.
  • OAuth & Cancellation (code_puppy/plugins/oauth_control.py, code_puppy/plugins/*_oauth/*)
    Shared wait/cancel helpers; OAuth flows accept cancel events, return booleans, and expose start_*_oauth_setup entry points used via BackgroundInteractiveCommand.
  • Shell Integration & Tools (code_puppy/tools/command_runner.py, code_puppy/tools/agent_tools.py, code_puppy/tools/file_modifications.py, code_puppy/config.py)
    CWD normalization, runtime notification for shell exec, structured AgentListMessage emission, replacement payload validation, and a new queue_limit config accessor with parsing/clamping.
  • Renderer/Console Bridges (code_puppy/messaging/legacy_bridge.py, code_puppy/messaging/renderers.py)
    Bridge forwards the legacy queue into the message bus; legacy rendering extracted and shared between interactive/sync renderers.
  • Tests (tests/..., many files; see changed tests list)
    Extensive new/updated tests for prompt-surface behavior, BackgroundInteractiveCommand returns, OAuth cancellation flows, queue/interject lifecycle, terminal capability detection, legacy bridge, spinner invalidation, and many interactive-mode integration scenarios.

Sequence Diagram

sequenceDiagram
    participant User
    participant PromptSurface as Prompt Surface
    participant CLI as CLI Runner
    participant Agent
    participant EventStream as Event Stream Handler
    participant Renderer as Rich Renderer / Bus

    User->>PromptSurface: submit (submit/queue/interject)
    PromptSurface->>CLI: PromptSubmission
    CLI->>CLI: normalize & dispatch
    CLI->>Agent: run_prompt_with_attachments()
    Agent->>EventStream: emit streaming deltas (TextPart, ToolCall, Shell, etc.)
    EventStream->>EventStream: is prompt-surface active?
    alt prompt-surface active
        EventStream->>PromptSurface: set_ephemeral_status / set_ephemeral_preview
        PromptSurface->>PromptSurface: update preview / invalidate UI
    else no prompt-surface
        EventStream->>Renderer: print to console (streaming)
        Renderer->>User: visible output
    end
    Agent->>EventStream: final AGENT RESPONSE
    EventStream->>PromptSurface: clear ephemeral status/preview
    EventStream->>Renderer: render final AGENT RESPONSE above prompt (via run_above_prompt)
    Renderer->>User: rendered final response (durable transcript)
    CLI->>PromptSurface: ready for next input

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • PR #180 — Overlaps changes to prompt_toolkit_completion.py (prompt/submission/completion flow) and interactive runtime integration.
  • PR #167 — Related edits to event-stream handling and streaming/banner rendering behavior.
  • PR #163 — Overlaps messaging types and rich renderer adjustments (messages/rich_renderer).

Poem

🐰✨ A prompt that hums beneath my paws,
Ephemeral previews without a pause,
Background hops and cancellable quests,
Streams tuck neatly into cozy nests,
The rabbit applauds—terminal applause!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 28.16%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: the PR title clearly describes the main feature being introduced, an interactive runtime enabling queue, interject, and thinking functionality. It is specific, concise, and directly reflects the core changes in the changeset.
  • Description check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.




coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
code_puppy/plugins/antigravity_oauth/register_callbacks.py (1)

258-266: ⚠️ Potential issue | 🟠 Major

Only treat auth as successful after model configuration succeeds.

Lines 258-266 merely warn when add_models_to_config() fails, but _perform_authentication() still returns True. That makes Line 280 switch to antigravity-gemini-3-pro-high even on a first-time auth where that model was never registered, leaving the session pointed at an unusable model.

Also applies to: 272-282

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code_puppy/plugins/antigravity_oauth/register_callbacks.py` around lines 258
- 266, The auth flow currently treats authentication as successful even if
add_models_to_config(...) fails; update _perform_authentication to only return
True when both the auth step and add_models_to_config succeed — if
add_models_to_config returns False, emit the warning (emit_warning) and return
False so the session does not switch to an unregistered model; locate the block
that calls add_models_to_config and adjust the control flow so success requires
both the access token/project_id handling and the model registration call (refer
to add_models_to_config, _perform_authentication, emit_success, emit_warning,
and emit_info to find the correct spot).
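
The suggested control flow can be sketched as follows. The collaborators are injected stand-ins and the function shape is an assumption for illustration, not the real _perform_authentication:

```python
def perform_authentication(fetch_token, add_models_to_config,
                           emit_warning) -> bool:
    """Hedged sketch of the fix: report success only when both the
    auth step and model registration succeed.

    fetch_token, add_models_to_config, and emit_warning are injected
    stand-ins for the plugin's real collaborators.
    """
    token = fetch_token()
    if token is None:
        return False  # auth itself failed
    if not add_models_to_config(token):
        emit_warning("Token obtained but model registration failed")
        return False  # do not switch the session to an unregistered model
    return True
```

With this shape, a failed add_models_to_config can no longer leave the session pointed at a model that was never registered.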
🧹 Nitpick comments (2)
tests/tools/test_file_modifications_extended.py (1)

225-244: Good test coverage for the validation path.

The test correctly verifies the KeyError handling when old_str is missing. The mock Agent pattern effectively captures the registered tool for direct invocation.

Consider adding companion tests for related validation scenarios:

🧪 Optional: Additional test cases for fuller coverage
def test_register_replace_in_file_rejects_missing_new_str(self, tmp_path):
    registered = {}

    class Agent:
        def tool(self, fn):
            registered[fn.__name__] = fn
            return fn

    register_replace_in_file(Agent())
    fn = registered["replace_in_file"]

    result = fn(
        Mock(),
        file_path=str(tmp_path / "test.py"),
        replacements=[{"old_str": "original"}],  # missing new_str
    )

    assert result["success"] is False
    assert result["changed"] is False
    assert "new_str" in result["message"]


def test_register_replace_in_file_rejects_non_string_values(self, tmp_path):
    registered = {}

    class Agent:
        def tool(self, fn):
            registered[fn.__name__] = fn
            return fn

    register_replace_in_file(Agent())
    fn = registered["replace_in_file"]

    result = fn(
        Mock(),
        file_path=str(tmp_path / "test.py"),
        replacements=[{"old_str": 123, "new_str": "updated"}],  # non-string old_str
    )

    assert result["success"] is False
    assert result["changed"] is False
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/tools/test_file_modifications_extended.py` around lines 225 - 244, Add
companion tests to cover the other validation paths for
register_replace_in_file: create two new test functions named
test_register_replace_in_file_rejects_missing_new_str and
test_register_replace_in_file_rejects_non_string_values that mirror the existing
test_register_replace_in_file_rejects_missing_old_str structure, register the
tool via register_replace_in_file(Agent()), call the registered
"replace_in_file" function with replacements missing "new_str" in the first test
and with a non-string "old_str" (e.g., integer) in the second, and assert
result["success"] is False, result["changed"] is False and that the error
message contains "new_str" for the first test (and appropriate validation
indication for the second).
code_puppy/command_line/core_commands.py (1)

176-218: Consider adding explicit return type annotation.

The return type was removed, but the function now returns either bool or BackgroundInteractiveCommand. Adding an explicit union type improves readability and IDE support.

📝 Suggested type annotation
-def handle_tutorial_command(command: str):
+def handle_tutorial_command(command: str) -> bool | BackgroundInteractiveCommand:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code_puppy/command_line/core_commands.py` around lines 176 - 218, The
function handle_tutorial_command currently returns either a bool or a
BackgroundInteractiveCommand but lacks a type annotation; update its signature
to include an explicit return type (e.g., Union[bool,
BackgroundInteractiveCommand]) and add the necessary typing import (from typing
import Union) so IDEs and linters understand the union return type; keep the
existing behavior and symbols (handle_tutorial_command and
BackgroundInteractiveCommand) unchanged elsewhere.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@code_puppy/agents/event_stream_handler.py`:
- Around line 140-150: The merge function _merge_tool_name can leak partial
streaming names (e.g. "agent_share_your_reasoning") before the final value
arrives; change _merge_tool_name to detect the special reasoning tool string
(e.g. "agent_share_your_reasoning") and avoid returning any intermediate/partial
concatenation that would be a prefix of that special name — instead keep
returning current_name until the merged result equals the full special name, and
only then return the full name; apply the same suppression logic in the other
merging code path noted (lines ~385-404) so partial prefixes of the reasoning
tool are never emitted.
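
One way to realize the suppression described above, sketched with a hypothetical display_name helper. The real _merge_tool_name tracks more streaming state; this only illustrates the prefix rule:

```python
REASONING_TOOL = "agent_share_your_reasoning"

def display_name(accumulated: str) -> str:
    """Return what may be shown for a partially streamed tool name.

    Any proper prefix of the reasoning tool name is withheld (empty
    string) until either the full name arrives or the stream diverges
    to a different tool name.
    """
    if accumulated != REASONING_TOOL and REASONING_TOOL.startswith(accumulated):
        return ""  # partial prefix of the suppressed name: hold it back
    return accumulated
```

Applying the same rule in every merge path keeps fragments like "agent_share" from flashing in the ephemeral status line.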

In `@code_puppy/cli_runner.py`:
- Around line 572-575: The code registers a runtime with
register_active_interactive_runtime(runtime) but only clears it on the normal
exit path, which can leave a stale global runtime after exceptions; to fix,
ensure you always call clear_active_interactive_runtime(runtime) in a finally
block paired with the try that follows the register so the runtime is removed
regardless of errors (referencing PromptRuntimeState,
register_active_interactive_runtime, clear_active_interactive_runtime, and
runtime.mark_idle) — move or add a finally that invokes
clear_active_interactive_runtime(runtime) and performs any necessary cleanup
after runtime.mark_idle().
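
The register/clear pairing this prompt asks for is the classic try/finally shape. A minimal sketch with injected stand-ins for register_active_interactive_runtime and clear_active_interactive_runtime:

```python
def run_with_runtime(runtime, register, clear, body):
    """Sketch of the suggested fix: pair registration with a finally
    block so the active runtime is cleared on every exit path,
    including exceptions.

    register, clear, and body are injected stand-ins; the real code
    wires these to module-level runtime helpers.
    """
    register(runtime)
    try:
        return body()
    finally:
        clear(runtime)  # runs on normal return and on exceptions alike
```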
- Around line 1194-1202: The code currently ignores the return value of
BackgroundInteractiveCommand.run(), so even when it returns False (indicating
failure) the code still emits a "completed" success lifecycle; change the await
asyncio.to_thread(command_result.run, command_result.cancel_event) call to
capture the return value (e.g., result = await
asyncio.to_thread(command_result.run, command_result.cancel_event)) and only
call emit_interject_queue_lifecycle(runtime, "completed", item=source_item,
level="success") when the captured result is truthy and
command_result.cancel_event.is_set() is False (i.e., if result and not
command_result.cancel_event.is_set()) so failed runs that return False do not
emit a success completion.
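
The capture-then-gate pattern described above, sketched with injected stand-ins. The real code also passes item and level arguments to the lifecycle emitter; those are omitted here:

```python
import asyncio
import threading

async def run_background_command(run_fn, cancel_event: threading.Event,
                                 emit_completed) -> bool:
    """Run a blocking command in a worker thread and capture its
    boolean result, emitting 'completed' only on a real, uncancelled
    success.

    run_fn and emit_completed are illustrative stand-ins for
    BackgroundInteractiveCommand.run and the lifecycle emitter.
    """
    result = await asyncio.to_thread(run_fn, cancel_event)
    if result and not cancel_event.is_set():
        emit_completed()  # only genuine successes report completion
    return bool(result)
```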
- Around line 1289-1299: The code cancels the active run via
cancel_active_run("interject") before checking whether runtime.request_interject
succeeded, which can drop the current task if the queue is full; to fix, call
runtime.request_interject(stripped_task,
allow_command_dispatch=allow_command_dispatch) first and inspect its ok result
(and position/item), and only call cancel_active_run("interject") if ok is True;
if ok is False, call emit_warning with the queue limit message (using
get_queue_limit()) and do not cancel the active run. Ensure this change is
applied where requested_action == "interject" and references the same symbols
(requested_action, runtime.request_interject, cancel_active_run, emit_warning,
get_queue_limit).
- Around line 1859-1867: The code treats the return of
run_prompt_with_attachments as a single object but that function returns a tuple
(result, task); change the call to unpack it (e.g., response, task = await
run_prompt_with_attachments(...)), then preserve the existing None check on
response (if response is None: return) before using response.output to set
agent_response; ensure prompt-only mode which returns (None, None) is handled by
the unpack and early return to avoid accessing response.output when response is
None.

In `@code_puppy/tools/command_runner.py`:
- Around line 1218-1225: The call to
get_active_interactive_runtime().notify_shell_started() is performed before
entering the try/finally, so if notify_shell_started or later
notify_shell_finished raises, it can short-circuit spinner/keyboard cleanup;
move notify_shell_started() into the main try block after acquiring keyboard
context (or after deciding release_keyboard_context) and ensure
notify_shell_finished() is invoked from the finally block but wrapped in its own
try/except to swallow/log any exceptions so they don't prevent cleanup;
reference get_active_interactive_runtime, notify_shell_started,
notify_shell_finished, _acquire_keyboard_context, and the
release_keyboard_context flag when applying the change.
- Around line 288-295: The _normalize_shell_cwd helper currently strips
leading/trailing whitespace which alters valid paths; change it to only collapse
whitespace-only inputs to None by returning None when cwd is None or cwd.strip()
is empty, otherwise return the original cwd unchanged (preserving any
intentional leading/trailing spaces) so callers receive the original non-empty
value.

In `@tests/test_cli_runner_full_coverage.py`:
- Around line 130-142: The prompt_side_effect helper currently wraps every
non-PromptSubmission value into PromptSubmission(action="submit", text=value),
which converts a test-provided None (meant to represent cancellation) into a
submission; modify prompt_side_effect so that after resolving value from
input_fn it returns None unchanged if value is None, returns the value unchanged
if isinstance(value, PromptSubmission), and only otherwise wraps into
PromptSubmission(action="submit", text=value); reference prompt_side_effect,
input_fn and PromptSubmission when making the change.

---

Outside diff comments:
In `@code_puppy/plugins/antigravity_oauth/register_callbacks.py`:
- Around line 258-266: The auth flow currently treats authentication as
successful even if add_models_to_config(...) fails; update
_perform_authentication to only return True when both the auth step and
add_models_to_config succeed — if add_models_to_config returns False, emit the
warning (emit_warning) and return False so the session does not switch to an
unregistered model; locate the block that calls add_models_to_config and adjust
the control flow so success requires both the access token/project_id handling
and the model registration call (refer to add_models_to_config,
_perform_authentication, emit_success, emit_warning, and emit_info to find the
correct spot).

---

Nitpick comments:
In `@code_puppy/command_line/core_commands.py`:
- Around line 176-218: The function handle_tutorial_command currently returns
either a bool or a BackgroundInteractiveCommand but lacks a type annotation;
update its signature to include an explicit return type (e.g., Union[bool,
BackgroundInteractiveCommand]) and add the necessary typing import (from typing
import Union) so IDEs and linters understand the union return type; keep the
existing behavior and symbols (handle_tutorial_command and
BackgroundInteractiveCommand) unchanged elsewhere.

In `@tests/tools/test_file_modifications_extended.py`:
- Around line 225-244: Add companion tests to cover the other validation paths
for register_replace_in_file: create two new test functions named
test_register_replace_in_file_rejects_missing_new_str and
test_register_replace_in_file_rejects_non_string_values that mirror the existing
test_register_replace_in_file_rejects_missing_old_str structure, register the
tool via register_replace_in_file(Agent()), call the registered
"replace_in_file" function with replacements missing "new_str" in the first test
and with a non-string "old_str" (e.g., integer) in the second, and assert
result["success"] is False, result["changed"] is False and that the error
message contains "new_str" for the first test (and appropriate validation
indication for the second).
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 05a39e5c-5f5a-4d69-9686-b49566313c54

📥 Commits

Reviewing files that changed from the base of the PR and between 6594344 and f0268c4.

📒 Files selected for processing (57)
  • IMPLEMENTATION_GUARDRAILS.md
  • code_puppy/agents/base_agent.py
  • code_puppy/agents/event_stream_handler.py
  • code_puppy/cli_runner.py
  • code_puppy/command_line/command_handler.py
  • code_puppy/command_line/config_commands.py
  • code_puppy/command_line/core_commands.py
  • code_puppy/command_line/interactive_command.py
  • code_puppy/command_line/interactive_runtime.py
  • code_puppy/command_line/prompt_toolkit_completion.py
  • code_puppy/config.py
  • code_puppy/messaging/legacy_bridge.py
  • code_puppy/messaging/messages.py
  • code_puppy/messaging/renderers.py
  • code_puppy/messaging/rich_renderer.py
  • code_puppy/messaging/spinner/__init__.py
  • code_puppy/messaging/spinner/console_spinner.py
  • code_puppy/plugins/antigravity_oauth/register_callbacks.py
  • code_puppy/plugins/chatgpt_oauth/oauth_flow.py
  • code_puppy/plugins/chatgpt_oauth/register_callbacks.py
  • code_puppy/plugins/claude_code_oauth/register_callbacks.py
  • code_puppy/plugins/oauth_control.py
  • code_puppy/terminal_utils.py
  • code_puppy/tools/agent_tools.py
  • code_puppy/tools/command_runner.py
  • code_puppy/tools/file_modifications.py
  • docs/INTERACTIVE_REGRESSION_CHECKLIST.md
  • tests/agents/test_event_stream_handler.py
  • tests/command_line/test_add_model_menu_coverage.py
  • tests/command_line/test_config_commands_full_coverage.py
  • tests/command_line/test_core_commands_full_coverage.py
  • tests/command_line/test_prompt_toolkit_coverage.py
  • tests/command_line/test_tutorial.py
  • tests/messaging/spinner/test_spinner_init.py
  • tests/messaging/test_legacy_bridge.py
  • tests/messaging/test_rich_renderer.py
  • tests/plugins/conftest.py
  • tests/plugins/test_antigravity_callbacks_coverage.py
  • tests/plugins/test_antigravity_register_callbacks.py
  • tests/plugins/test_chatgpt_oauth_coverage.py
  • tests/plugins/test_chatgpt_oauth_integration.py
  • tests/plugins/test_claude_code_oauth_callbacks.py
  • tests/plugins/test_claude_code_oauth_coverage.py
  • tests/test_agent_tools_coverage.py
  • tests/test_cli_runner_coverage.py
  • tests/test_cli_runner_full_coverage.py
  • tests/test_command_overhaul_targeted.py
  • tests/test_config.py
  • tests/test_config_and_storage_edge_cases.py
  • tests/test_config_full_coverage.py
  • tests/test_console_spinner_coverage.py
  • tests/test_prompt_toolkit_completion.py
  • tests/test_terminal_utils.py
  • tests/test_terminal_utils_comprehensive.py
  • tests/tools/browser/test_remaining_coverage.py
  • tests/tools/test_command_runner_full_coverage.py
  • tests/tools/test_file_modifications_extended.py

Comment on lines +1289 to +1299
if requested_action == "interject":
    log_event("interject_banner", text=stripped_task)
    await cancel_active_run("interject")
    ok, position, item = runtime.request_interject(
        stripped_task,
        allow_command_dispatch=allow_command_dispatch,
    )
    if not ok:
        emit_warning(
            f"Queue full ({get_queue_limit()}). Cannot interject right now."
        )

⚠️ Potential issue | 🟠 Major

Don't cancel the current run before you know the interject fits.

At Line 1291 the active task is cancelled before Lines 1292-1296 check queue capacity. If the queue is already full, the interject is rejected but the user's in-flight work has already been stopped.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@code_puppy/cli_runner.py` around lines 1289 - 1299, The code cancels the
active run via cancel_active_run("interject") before checking whether
runtime.request_interject succeeded, which can drop the current task if the
queue is full; to fix, call runtime.request_interject(stripped_task,
allow_command_dispatch=allow_command_dispatch) first and inspect its ok result
(and position/item), and only call cancel_active_run("interject") if ok is True;
if ok is False, call emit_warning with the queue limit message (using
get_queue_limit()) and do not cancel the active run. Ensure this change is
applied where requested_action == "interject" and references the same symbols
(requested_action, runtime.request_interject, cancel_active_run, emit_warning,
get_queue_limit).
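
The reordered flow the review suggests, sketched with injected stand-ins. In the real code cancel_active_run is awaited; it is shown synchronously here for brevity:

```python
def handle_interject(runtime, cancel_active_run, emit_warning,
                     task: str) -> bool:
    """Sketch of the suggested ordering: reserve queue space first,
    and cancel the active run only once the interject is accepted.

    runtime, cancel_active_run, and emit_warning are stand-ins for
    the CLI runner's real collaborators.
    """
    ok, position, item = runtime.request_interject(task)
    if not ok:
        emit_warning("Queue full. Cannot interject right now.")
        return False  # active work keeps running untouched
    cancel_active_run("interject")
    return True
```

Checking capacity before cancelling means a full queue can no longer destroy the user's in-flight work.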


coderabbitai bot left a comment


🧹 Nitpick comments (2)
tests/test_cli_runner_full_coverage.py (2)

1521-1528: Replace long fixed sleep with event-driven blocking to reduce CI flakiness.

At Line 1527, await asyncio.sleep(10) makes this test unnecessarily slow and fragile if cancellation timing shifts. Prefer waiting on an event/future that is cancelled/fulfilled by the test flow.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_cli_runner_full_coverage.py` around lines 1521 - 1528, The test
uses a long fixed sleep inside the fake_run coroutine (function fake_run) when
handling the "first task" prompt, which should be replaced by waiting on an
event to avoid timing flakiness: create an asyncio.Event or Future (e.g.,
first_task_continue) and in fake_run, replace await asyncio.sleep(10) with await
first_task_continue.wait() (or awaiting the Future); then signal or cancel that
event/future from the test flow when you want the first task to continue or be
cancelled (use the existing first_task_started Event to know when to trigger
it). Update references to first_task_started and any assertions around
render_order/started_prompts so the test coordinates via the new
first_task_continue event instead of a fixed sleep.
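
The event-driven coordination the review suggests can be sketched like this; names such as first_task_continue follow the suggestion, but the harness is hypothetical:

```python
import asyncio

async def fake_run(first_task_started: asyncio.Event,
                   first_task_continue: asyncio.Event) -> str:
    """Event-driven stand-in for the test's fake_run coroutine:
    instead of a fixed asyncio.sleep(10), block on an event the test
    controls, making the test deterministic."""
    first_task_started.set()          # signal: first task is running
    await first_task_continue.wait()  # wait until the test releases us
    return "first task done"

async def drive() -> str:
    started = asyncio.Event()
    cont = asyncio.Event()
    task = asyncio.create_task(fake_run(started, cont))
    await started.wait()  # no sleeps, no timing races
    cont.set()
    return await task
```

The test can also cancel the task instead of setting the event, exercising the cancellation path with the same determinism.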

190-192: Tighten lifecycle tests to assert single emission, not just payload text.

These tests currently validate emitted text but not multiplicity. If duplicate lifecycle emits are introduced, they would still pass. Add an explicit single-call assertion in each positive case.

Suggested assertion hardening
     emitted = message_bus.emit.call_args[0][0]
     assert emitted.text == "[INTERJECT] stopping current work: steer now"
+    message_bus.emit.assert_called_once()

Apply similarly to the queue and command-completion lifecycle tests.

Also applies to: 212-214, 335-337

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_cli_runner_full_coverage.py` around lines 190 - 192, The test
currently only checks the emitted payload text but not that the emitter was
invoked exactly once; update the test(s) that inspect message_bus.emit (e.g.,
the block capturing emitted = message_bus.emit.call_args[0][0]) to also assert a
single emission using message_bus.emit.assert_called_once() (or assert
message_bus.emit.call_count == 1) before examining emitted.text, and apply the
same single-call assertion to the similar cases at the other locations (the
queue and command-completion lifecycle tests).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/test_cli_runner_full_coverage.py`:
- Around line 1521-1528: The test uses a long fixed sleep inside the fake_run
coroutine (function fake_run) when handling the "first task" prompt, which
should be replaced by waiting on an event to avoid timing flakiness: create an
asyncio.Event or Future (e.g., first_task_continue) and in fake_run, replace
await asyncio.sleep(10) with await first_task_continue.wait() (or awaiting the
Future); then signal or cancel that event/future from the test flow when you
want the first task to continue or be cancelled (use the existing
first_task_started Event to know when to trigger it). Update references to
first_task_started and any assertions around render_order/started_prompts so the
test coordinates via the new first_task_continue event instead of a fixed sleep.
- Around line 190-192: The test currently only checks the emitted payload text
but not that the emitter was invoked exactly once; update the test(s) that
inspect message_bus.emit (e.g., the block capturing emitted =
message_bus.emit.call_args[0][0]) to also assert a single emission using
message_bus.emit.assert_called_once() (or assert message_bus.emit.call_count ==
1) before examining emitted.text, and apply the same single-call assertion to
the similar cases at the other locations (the queue and command-completion
lifecycle tests).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4c3081e1-63b5-4fa1-8bb4-d0550322930e

📥 Commits

Reviewing files that changed from the base of the PR and between f0268c4 and f6db02d.

📒 Files selected for processing (10)
  • code_puppy/agents/event_stream_handler.py
  • code_puppy/cli_runner.py
  • code_puppy/plugins/antigravity_oauth/register_callbacks.py
  • code_puppy/tools/command_runner.py
  • tests/agents/test_event_stream_handler.py
  • tests/plugins/test_antigravity_callbacks_coverage.py
  • tests/plugins/test_antigravity_register_callbacks.py
  • tests/test_cli_runner_coverage.py
  • tests/test_cli_runner_full_coverage.py
  • tests/tools/test_command_runner_full_coverage.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • code_puppy/tools/command_runner.py
  • code_puppy/plugins/antigravity_oauth/register_callbacks.py
